Geometric Foundations of Convex Optimization
MATH008 Lesson 8
00:00
Imagine navigating a complex landscape where your goal is to find the closest point to safety. In the language of optimization, this 'safety' is a set $C$, and the act of finding the closest point is a projection. While intuition suggests there is always a single 'closest' point, mathematical reality is more nuanced. The geometric foundation of convex optimization rests on formalizing 'proximity', moving beyond Euclidean intuition to rigorous functional representations like indicator and support functions.

1. Projections and Distances

The distance from a point $x_0$ to a set $C$ is defined as the infimum of all possible distances to points within the set:

$\text{dist}(x_0, C) = \inf\{\|x_0 - x\| \mid x \in C\}$

The specific optimizer that achieves this distance is the projection operator:

$P_C(x_0) = \text{argmin}\{\|x - x_0\| \mid x \in C\}$

For a simple hyperplane defined by $a^T x = b$, the projection has a beautiful closed-form solution: $P_C(x_0) = x_0 + (b - a^T x_0)a/\|a\|_2^2$. However, for general sets, this remains a constrained optimization problem: minimize $\|x - x_0\|$ subject to $f_i(x) \leq 0$ and $Ax = b$.
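The closed-form hyperplane projection can be checked numerically. A minimal sketch using NumPy; the specific vectors `a`, `b`, and `x0` are arbitrary illustrative choices:

```python
import numpy as np

# Hyperplane C = {x : a^T x = b}  (illustrative data)
a = np.array([1.0, 2.0, 2.0])
b = 3.0
x0 = np.array([4.0, 0.0, 1.0])

# Closed-form projection: P_C(x0) = x0 + (b - a^T x0) a / ||a||_2^2
p = x0 + (b - a @ x0) * a / np.linalg.norm(a) ** 2

# The projection lands exactly on the hyperplane ...
print(np.isclose(a @ p, b))  # True

# ... and achieves the known distance |b - a^T x0| / ||a||_2
dist = abs(b - a @ x0) / np.linalg.norm(a)
print(np.isclose(np.linalg.norm(x0 - p), dist))  # True
```

Note that the residual `x0 - p` is a scalar multiple of `a`, i.e., the displacement is orthogonal to the hyperplane, which is exactly the geometric picture behind the formula.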

2. Functional Geometry

To treat geometric constraints as objective components, we employ two powerful functional 'mirrors':

  • The Indicator Function $I_C(x)$: $I_C(x) = \begin{cases} 0 & x \in C \\ +\infty & x \notin C \end{cases}$. This collapses geometry into a numerical penalty.
  • The Support Function $S_C(x)$: $S_C(x) = \sup_{y \in C} x^T y$. This characterizes the set by its bounding hyperplanes in every direction.
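Both functional mirrors are easy to demonstrate on a polytope. For a polytope, the supremum in the support function is attained at a vertex, so $S_C$ reduces to a maximum over finitely many points. A minimal sketch with an illustrative vertex set (the membership test here only checks the vertices themselves; a robust test for the full hull would require a small LP):

```python
import numpy as np

# Vertices of C = conv{(1,0), (0,1), (-1,-1)}  (illustrative choice)
V = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])

def support(x):
    # S_C(x) = sup_{y in C} x^T y; attained at a vertex for a polytope
    return np.max(V @ x)

def indicator(x, tol=1e-9):
    # I_C(x) = 0 if x in C, +inf otherwise.
    # Simplified: only the vertices are tested for membership here.
    in_C = any(np.allclose(x, v, atol=tol) for v in V)
    return 0.0 if in_C else np.inf

print(support(np.array([1.0, 0.0])))    # 1.0: extent of C in direction e1
print(indicator(np.array([0.0, 1.0])))  # 0.0
print(indicator(np.array([5.0, 5.0])))  # inf
```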

Theorem: Motzkin's Theorem

A nonempty, closed set $C \subseteq \mathbf{R}^n$ is a Chebyshev set (every point $x_0$ has a unique closest point in $C$) if and only if $C$ is convex.

Proof Sketch (Uniqueness)

Suppose $C$ is convex and the norm is strictly convex (the Euclidean norm, for instance). If there were two distinct closest points $u, v \in C$ with $\|u - x_0\| = \|v - x_0\| = d$, then their midpoint $w = (u+v)/2$ would lie in $C$ by convexity.

Since $u \neq v$, the vectors $u - x_0$ and $v - x_0$ are distinct vectors of equal length, so strict convexity of the norm gives a strict inequality: $\|w - x_0\| = \|\frac{1}{2}(u - x_0) + \frac{1}{2}(v - x_0)\| < \frac{1}{2}\|u - x_0\| + \frac{1}{2}\|v - x_0\| = d$.

This contradicts the assumption that $d$ was the minimum distance. Thus, the projection must be unique.
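The necessity of convexity shows up in even the simplest nonconvex example. A minimal one-dimensional sketch: the two-point set $\{-1, +1\}$ has two closest points to the origin, while its convex hull $[-1, 1]$ projects uniquely:

```python
import numpy as np

# Nonconvex set: the two-point set {-1, +1} on the real line.
# From x0 = 0, both points are at distance 1: argmin is not a singleton.
C_nonconvex = np.array([-1.0, 1.0])
x0 = 0.0
d = np.abs(C_nonconvex - x0)
minimizers = C_nonconvex[np.isclose(d, d.min())]
print(minimizers)  # [-1.  1.] -- two tied "closest" points

# Convex counterpart: the interval [-1, 1] (the convex hull of the set).
# Projection onto an interval is unique: simply clip x0 into it.
print(np.clip(x0, -1.0, 1.0))  # 0.0
```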

Caveat 8.4: Norm Dependency

We often construct a separating hyperplane using: $(P_C(x_0) - x_0)^T (x - \frac{1}{2}(x_0 + P_C(x_0))) = 0$. Beware! This specific construction is valid only for the Euclidean norm. General norms require more nuanced treatments of orthogonality.
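In the Euclidean case the construction above does work, and it can be verified numerically. A minimal sketch, assuming $C$ is the Euclidean unit disk and $x_0$ an illustrative exterior point; the hyperplane has normal $P_C(x_0) - x_0$ and passes through the midpoint of $x_0$ and its projection:

```python
import numpy as np

rng = np.random.default_rng(0)

# C = Euclidean unit disk; x0 lies outside it (illustrative example)
x0 = np.array([3.0, 4.0])
p = x0 / np.linalg.norm(x0)   # Euclidean projection onto the disk
n = p - x0                     # hyperplane normal P_C(x0) - x0
m = 0.5 * (x0 + p)             # hyperplane passes through the midpoint

# x0 lies strictly on one side of the hyperplane ...
print(n @ (x0 - m) < 0)        # True

# ... and every point of C on the other side (sampled check)
pts = rng.uniform(-1, 1, size=(1000, 2))
pts = pts[np.linalg.norm(pts, axis=1) <= 1.0]  # keep points inside the disk
print(np.all((pts - m) @ n > 0))               # True
```

The sampled check is not a proof, but it illustrates why the midpoint hyperplane separates: the normal direction is exactly the Euclidean displacement from $x_0$ to its projection.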

🎯 Core Insight
Convexity is the 'glue' that ensures stability in geometric optimization. Without it, even the simple task of 'finding the closest point' can result in multiple, conflicting solutions, leading to algorithmic instability.